The SENSEI Annotated Corpus: Human Summaries of Reader Comment Conversations in On-line News
Abstract
Researchers are beginning to explore how to generate summaries of extended argumentative conversations in social media, such as those found in reader comments in on-line news. To date, however, there has been little discussion of what these summaries should be like and a lack of human-authored exemplars, quite likely because writing summaries of this kind of interchange is so difficult. In this paper we propose one type of reader comment summary – the conversation overview summary – that aims to capture the key argumentative content of a reader comment conversation. We describe a method we have developed to support humans in authoring conversation overview summaries and present a publicly available corpus – the first of its kind – of news articles plus comment sets, each multiply annotated, according to our method, with conversation overview summaries.
Similar resources
Summarizing Multi-Party Argumentative Conversations in Reader Comment on News
Existing approaches to summarizing multi-party argumentative conversations in reader comment are extractive and fail to capture the argumentative nature of these conversations. Work on argument mining proposes schemes for identifying argument elements and relations in text but has not yet addressed how summaries might be generated from a global analysis of a conversation based on these schemes....
Finding Good Conversations Online: The Yahoo News Annotated Comments Corpus
This work presents a dataset and annotation scheme for the new task of identifying “good” conversations that occur online, which we call ERICs: Engaging, Respectful, and/or Informative Conversations. We develop a taxonomy to reflect features of entire threads and individual comments which we believe contribute to identifying ERICs; code a novel dataset of Yahoo News comment threads (2.4k thread...
What's the Issue Here?: Task-based Evaluation of Reader Comment Summarization Systems
Automatic summarization of reader comments in on-line news is an extremely challenging task and a capability for which there is a clear need. Work to date has focussed on producing extractive summaries using well-known techniques imported from other areas of language processing. But are extractive summaries of comments what users really want? Do they support users in performing the sorts of tas...
General Versus Specific Sentences: Automatic Identification and Application to Analysis of News Summaries
In this paper, we introduce the task of identifying general and specific sentences in news articles. Instead of embarking on a new annotation effort to obtain data for the task, we explore the possibility of leveraging existing large corpora annotated with discourse information to train a classifier. We introduce several classes of features that capture lexical and syntactic information, as wel...
Multilevel Annotation of Agreement and Disagreement in Italian News Blogs
In this paper, we present a corpus of news blog conversations in Italian annotated with gold standard agreement/disagreement relations (ADRs) at message and sentence levels. This is the first resource of this kind in Italian. Analysis of ADRs at the two levels shows that agreement annotated at message level is consistent and generally reflected at sentence level, and that the structure of d...